Longitudinal COVID Risk/Reward Behaviors & Mental Health Modeling
This project examines longitudinal changes in risk- and reward-related behaviors and decision-making among a general community sample over the first year following the initial COVID-19 lockdown in the United States, which began in March 2020. The focus is on understanding how these behaviors evolved during the pandemic and identifying the factors influencing their trajectories. This study was conducted by the Computational Psychopathology Laboratory at McLean Hospital & Harvard Medical School.
Author: Adrian Medina
Date: October 4, 2024
Overview
Study Design: The study spans from May 2020 to February 2021, encompassing seven sessions including the baseline. Follow-up assessments were conducted at two weeks, one month, six weeks, two months, ten weeks, and three months post-baseline. This design allows for detailed tracking of changes over a crucial period of adjustment to pandemic-related disruptions.
Data Collection: Data were collected on a variety of domains, including:
Risky Behaviors: As characterized by the Centers for Disease Control and Prevention (CDC).
Risk/Reward Appraisals: Assessment of individuals’ perceptions and evaluations of risky situations versus potential gains.
Clinical Measures: Focused primarily on mental health, with a few variables related to physical health.
Coping & Support Measures: Insights into how individuals cope with stress and their available support systems.
Psychometric Measures: Standardized tests to measure psychological variables, offering a window into the psychological state of participants.
Analytic Approach: The primary analytic strategy involves using multilevel modeling to:
Determine the best-fitting models for describing the trajectories of risk/reward behavior variables.
Incorporate clinical, psychometric, coping, and support measures to enrich our understanding of the factors influencing risk and reward inclinations or aversions during this unprecedented period.
Goals: The goals of this project are to:
Identify key patterns and changes in risk-taking and reward-seeking behaviors across the studied time frame.
Explore the impact of psychological and social factors on these behaviors to better understand the mental health implications of the pandemic.
Develop predictive insights that can inform future public health strategies and interventions in similar crises.
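As a concrete illustration of the multilevel-modeling strategy, an unconditional growth model for one of the behavior outcomes might look like the sketch below. This is a minimal, hypothetical specification: the variable names (`response_counts`, `time`, `record_id`, `covid_long_final`) match those constructed later in this document, but the actual models reported by the study are determined during model fitting.

```r
library(lme4)
library(lmerTest)

# Random-intercept, random-slope growth model: each subject gets their own
# baseline level and trajectory of total responses across the seven sessions
growth_model <- lmer(
  response_counts ~ time + (1 + time | record_id),
  data = covid_long_final
)
summary(growth_model)

# Compare against a random-intercept-only model to test whether allowing
# between-subject variability in trajectories improves fit
intercept_model <- lmer(
  response_counts ~ time + (1 | record_id),
  data = covid_long_final
)
anova(intercept_model, growth_model)
```

Clinical, psychometric, coping, and support measures would then enter such a model as level-2 predictors (e.g., cross-level interactions with `time`).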
Data Frame Initialization
Set up the R environment by configuring the CRAN repository, installing essential packages, and setting base paths.
# Set CRAN repository for consistent package installation
options(repos = c(CRAN = "http://cran.rstudio.com/"))
if (!require(pacman)) install.packages("pacman")
library(pacman)

# Use p_load function to install (if necessary) and load packages
p_load(psych, ggplot2, lme4, Matrix, lmerTest, nlme, dplyr, performance,
       interactions, sampling, tidyr, tidyverse, kableExtra, broom.mixed,
       gridExtra, sjPlot, ggridges, PupillometryR, patchwork)

# Define base path for streamlined access/replication
base_path <- "~/GitHub/COVID-MultilevelModeling/data_files"

# Set working directory
setwd(base_path)

# Unzip the file in the local directory
unzip("covid_long.csv.zip")

# Read in risk/reward dataset, presently in long format
covid_long <- read.csv("covid_long.csv")

# Read in longitudinal mental health questionnaire data, presently in wide format
mentalhealth_survey_data <- read.csv("FINAL_SURVEY_DATA_2024-06-12 .csv")
1. pacman provides efficient package management when loading multiple packages within the same file.
Load the source dataset, recode time values from 1–7 to 0–6, and initialize unified columns for the behavior, risk, and reward variables.
Code
# Recode time from values of 1-7 to 0-6
covid_long$time <- covid_long$Index1 - 1

# Define the behavior variable names
beh_variable_names <- c(
  "L1_mail", "HM26_plane", "L2_takeout", "L4_tennis", "LM7_walk_others",
  "LM8_golf", "MOD16_bbq", "HM24_eat_rest_in", "LM9_stay_hotel",
  "LM11_library", "LM12_eat_rest_out", "LM14_playground", "H33_going_movies",
  "L5_camping", "MOD17_beach", "MOD18_mall", "LM6_grocery", "MOD20_work_office",
  "HM28_play_football", "MOD21_pool", "H37_bar", "MOD22_visit_relatives",
  "LM13_walk_downtown", "HM23_hair_salon", "HM25_wedding", "L3_pump_gas",
  "HM27_play_basketball", "MOD15_dinner_friend", "H36_attend_religious",
  "HM29_shake_hands", "H30_buffet", "H31_gym", "H32_amusement_park",
  "LM10_doc_office", "H34_concert", "H35_going_sports", "MOD19_kids_school"
)

# Create a new '_unified' column (to organize behavior data in non-iterative
# manner) for each behavior variable
for (var_name in beh_variable_names) {
  covid_long[paste0(var_name, "_unified")] <- NA
}

# Define the base risk variable names: each behavior item has a 'Y' (yes) and
# an 'N' (no) risk-appraisal counterpart, e.g. "L1Y_mail_risk" and
# "L1N_mail_risk". Generating them from the behavior names is equivalent to
# writing out the full Y and N lists item by item.
risk_variable_names <- c(
  paste0(sub("_", "Y_", beh_variable_names), "_risk"),
  paste0(sub("_", "N_", beh_variable_names), "_risk")
)

# Create a new '_unified' column (to organize risk data in non-iterative
# manner) for each risk variable
for (var_name in risk_variable_names) {
  covid_long[paste0(var_name, "_unified")] <- NA
}

# Define the base reward variable names, which follow the same Y/N scheme
rew_variable_names <- c(
  paste0(sub("_", "Y_", beh_variable_names), "_rew"),
  paste0(sub("_", "N_", beh_variable_names), "_rew")
)

# Create a new '_unified' column (to organize reward data in non-iterative
# manner) for each reward variable
for (var_name in rew_variable_names) {
  covid_long[paste0(var_name, "_unified")] <- NA
}
The code below searches the source dataset ‘covid_long’ to verify that each behavior, risk, and reward variable has all of its longitudinal counterparts (based on the consistent naming scheme already included in the source dataset). This check confirms that all variables are named correctly so that they can be included in modeling. The output of each check should be ‘character(0)’.
Code
# Define the naming scheme for follow-up sessions
follow_up_suffixes <- c("", "_two_week", "_monthly", "_six_week",
                        "_sec_monthly", "_ten_week", "_third_monthly")

# Create a list of all expected columns based on the naming scheme
beh_expected_columns <- unlist(lapply(beh_variable_names, function(var) {
  paste0(var, follow_up_suffixes)
}))

# Check which expected behavior columns are missing in the data frame
beh_missing_columns <- setdiff(beh_expected_columns, names(covid_long))

# Output the missing behavior columns (if any)
print(beh_missing_columns)
character(0)
Code
# Create a list of all expected risk columns based on the naming scheme
risk_expected_columns <- unlist(lapply(risk_variable_names, function(var) {
  paste0(var, follow_up_suffixes)
}))

# Check which expected risk columns are missing in the data frame
risk_missing_columns <- setdiff(risk_expected_columns, names(covid_long))

# Output the missing risk columns (if any)
print(risk_missing_columns)
character(0)
Code
# Create a list of all expected reward columns based on the naming scheme
rew_expected_columns <- unlist(lapply(rew_variable_names, function(var) {
  paste0(var, follow_up_suffixes)
}))

# Check which expected reward columns are missing in the data frame
rew_missing_columns <- setdiff(rew_expected_columns, names(covid_long))

# Output the missing reward columns (if any)
print(rew_missing_columns)
character(0)
Filter out subjects with completely missing data across all unified variables, recode ‘NA’ values to ‘2’ in behavior variables, and populate unified columns according to specific time points, ensuring each column accurately reflects the data for each session.
Code
# Generate unified variable names
beh_unified_vars <- paste0(beh_variable_names, "_unified")
risk_unified_vars <- paste0(risk_variable_names, "_unified")
rew_unified_vars <- paste0(rew_variable_names, "_unified")

# Combine all unified variable names
all_unified_vars <- c(beh_unified_vars, risk_unified_vars, rew_unified_vars)

# Combine all expected columns
all_expected_columns <- c(beh_expected_columns, risk_expected_columns, rew_expected_columns)

# Create the subset data frame
covid_long_analytic <- covid_long %>%
  select(record_id, time, baseline_date_complete,
         all_of(c(all_expected_columns, all_unified_vars)))

# Create complete expected columns per variable type
beh_all_columns <- c(beh_expected_columns, beh_unified_vars)
risk_all_columns <- c(risk_expected_columns, risk_unified_vars)
rew_all_columns <- c(rew_expected_columns, rew_unified_vars)
all_columns <- c(all_expected_columns, all_unified_vars)

# Filter out rows where all entries in 'all_columns' are NA
# (Tested using different filter variables, all of which yielded the same
# number of subjects and iterations, so the most comprehensive option was kept)
covid_long_filtered <- covid_long_analytic %>%
  filter(rowSums(!is.na(select(., all_of(all_columns)))) > 0)

# Count NA values in each column of the filtered data frame
na_counts <- colSums(is.na(covid_long_filtered))

# Recode NA values to 2 in specified behavior columns using explicit dplyr namespace
covid_long_filtered <- dplyr::mutate(
  covid_long_filtered,
  dplyr::across(dplyr::all_of(beh_expected_columns), ~ dplyr::if_else(is.na(.), 2, .))
)

# Double-check whether there are still NA values in these columns
sum_na <- sapply(covid_long_filtered[beh_expected_columns], function(x) sum(is.na(x)))
print(sum_na)  # Shows the count of NA values per column after the replacement

# Explicitly populate the unified behavior columns based on 'time' value
for (var in beh_variable_names) {
  for (i in 0:6) {
    covid_long_filtered <- dplyr::mutate(
      covid_long_filtered,
      !!paste0(var, "_unified") := dplyr::if_else(
        time == i,
        covid_long_filtered[[paste0(var, follow_up_suffixes[i + 1])]],
        covid_long_filtered[[paste0(var, "_unified")]],
        missing = NA
      )
    )
  }
}

# Explicitly populate the unified risk columns based on 'time' value
for (var in risk_variable_names) {
  for (i in 0:6) {
    covid_long_filtered <- dplyr::mutate(
      covid_long_filtered,
      !!paste0(var, "_unified") := dplyr::if_else(
        time == i,
        covid_long_filtered[[paste0(var, follow_up_suffixes[i + 1])]],
        covid_long_filtered[[paste0(var, "_unified")]],
        missing = NA
      )
    )
  }
}

# Explicitly populate the unified reward columns based on 'time' value
for (var in rew_variable_names) {
  for (i in 0:6) {
    covid_long_filtered <- dplyr::mutate(
      covid_long_filtered,
      !!paste0(var, "_unified") := dplyr::if_else(
        time == i,
        covid_long_filtered[[paste0(var, follow_up_suffixes[i + 1])]],
        covid_long_filtered[[paste0(var, "_unified")]],
        missing = NA
      )
    )
  }
}
Behavior Data: Wrangling & Transformations
Analyzes behavioral responses by filtering out subjects with uniformly empty responses and counting affirmative and negative responses. It sums these counts into total response metrics for each subject and time point, highlighting behavioral patterns.
Code
# Create the behavior subset data frame
covid_behavior_filtered <- covid_long_filtered %>%
  select(record_id, time, baseline_date_complete, all_of(beh_unified_vars))

# Filter out empty rows of subject data (i.e., '2' in all behavioral variables)
covid_behavior_filtered <- covid_behavior_filtered %>%
  # Add a helper column that checks if all values in specified columns are '2'
  mutate(all_twos = rowSums(across(all_of(beh_unified_vars)) == 2) == length(beh_unified_vars)) %>%
  # Filter out rows where the helper column is TRUE
  filter(!all_twos) %>%
  # Remove the helper column now that it is no longer needed
  select(-all_twos)

# Count only '0' values (yes) for each row across the unified behavior columns
# (selecting by name avoids fragile positional indexing of id/date columns)
covid_behavior_filtered$yes_counts <- rowSums(covid_behavior_filtered[beh_unified_vars] == 0, na.rm = TRUE)

# Count only '1' values (no) for each row across the unified behavior columns
covid_behavior_filtered$no_counts <- rowSums(covid_behavior_filtered[beh_unified_vars] == 1, na.rm = TRUE)

# Sum the 'yes_counts' and 'no_counts' to get the total response counts
covid_behavior_filtered$response_counts <- covid_behavior_filtered$yes_counts + covid_behavior_filtered$no_counts
Risk Data: Wrangling & Transformations
Processes risk-related data by excluding subjects with entirely missing risk variables. It computes sums and counts of non-missing risk entries and derives an average risk score adjusted for actual data availability, to assess risk profiles longitudinally.
Code
# Create the risk subset data frame
covid_risk_filtered <- covid_long_filtered %>%
  select(record_id, time, baseline_date_complete, all_of(risk_unified_vars))

# Filter out empty rows of subject data (i.e., 'NA' in all risk variables)
covid_risk_filtered <- covid_risk_filtered %>%
  # Add a helper column that checks if all values in specified columns are NA
  mutate(all_na = rowSums(is.na(across(all_of(risk_unified_vars)))) == length(risk_unified_vars)) %>%
  # Filter out rows where the helper column is TRUE
  filter(!all_na) %>%
  # Remove the helper column now that it is no longer needed
  select(-all_na)

# Create 'sum_risks' and 'num_risks' columns by summing and counting all
# non-missing risk-related entries for each row
covid_risk_filtered$sum_risks <- rowSums(covid_risk_filtered[risk_unified_vars], na.rm = TRUE)
covid_risk_filtered$num_risks <- rowSums(!is.na(covid_risk_filtered[risk_unified_vars]))

# Create 'avg_risk' column by dividing 'sum_risks' by 'num_risks'
covid_risk_filtered$avg_risk <- covid_risk_filtered$sum_risks / covid_risk_filtered$num_risks
Reward Data: Wrangling & Transformations
Processes reward-related data by excluding subjects with entirely missing reward variables and computing sums and counts of non-missing reward entries. It derives an average reward score adjusted for data availability, to analyze reward dynamics across time points. The section concludes by merging the refined reward data with the previously processed behavioral and risk data to form a comprehensive dataset for further analysis.
Code
# Create the reward subset data frame
covid_rew_filtered <- covid_long_filtered %>%
  select(record_id, time, baseline_date_complete, all_of(rew_unified_vars))

# Filter out empty rows of subject data (i.e., 'NA' in all reward variables)
covid_rew_filtered <- covid_rew_filtered %>%
  # Add a helper column that checks if all values in specified columns are NA
  mutate(all_na = rowSums(is.na(across(all_of(rew_unified_vars)))) == length(rew_unified_vars)) %>%
  # Filter out rows where the helper column is TRUE
  filter(!all_na) %>%
  # Remove the helper column now that it is no longer needed
  select(-all_na)

# Create 'sum_rew' and 'num_rew' columns by summing and counting all
# non-missing reward-related entries for each row
covid_rew_filtered$sum_rew <- rowSums(covid_rew_filtered[rew_unified_vars], na.rm = TRUE)
covid_rew_filtered$num_rew <- rowSums(!is.na(covid_rew_filtered[rew_unified_vars]))

# Create 'avg_rew' column by dividing 'sum_rew' by 'num_rew'
covid_rew_filtered$avg_rew <- covid_rew_filtered$sum_rew / covid_rew_filtered$num_rew

# Merge covid_behavior_filtered with covid_risk_filtered
covid_long_intermediate <- dplyr::left_join(covid_behavior_filtered, covid_risk_filtered,
                                            by = c("record_id", "time"))

# Merge the intermediate result with covid_rew_filtered
covid_long_final <- dplyr::left_join(covid_long_intermediate, covid_rew_filtered,
                                     by = c("record_id", "time"))
Categorizes risk and reward variables into ‘Yes’ and ‘No’ groups and then further stratifies them into ‘Low’, ‘Moderate’, and ‘High’ risk categories based on predefined item numbers. Computes the sum and average scores for each group by risk category, enabling detailed analysis of responses within the stratified groups.
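The stratified scoring described above is not shown in this excerpt; a minimal sketch of the general approach follows. It assumes the item-name prefixes (L/LM, MOD, HM/H) encode the predefined Low/Moderate/High categories — the helper `score_group` and the exact prefix-to-category mapping are illustrative assumptions, not the study's actual item list.

```r
# Hypothetical helper: sum and average a set of unified columns selected by a
# name pattern, ignoring missing entries
score_group <- function(df, pattern) {
  cols <- grep(pattern, names(df), value = TRUE)
  sums <- rowSums(df[cols], na.rm = TRUE)
  list(sum = sums, avg = sums / rowSums(!is.na(df[cols])))
}

# 'Low' risk, 'Yes' items: assumes L/LM item-number prefixes denote low-risk
# activities (e.g., "L1Y_mail_risk_unified", "LM7Y_walk_others_risk_unified")
low_risk_Y <- score_group(covid_long_final, "^LM?[0-9]+Y_.*_risk_unified$")
covid_long_final$sum_low_risk_Y <- low_risk_Y$sum
covid_long_final$avg_low_risk_Y <- low_risk_Y$avg

# The same call pattern would repeat for MOD (moderate) and HM/H (high)
# prefixes, for the 'N' items, and for the '_rew_unified' reward columns
```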
Converts mental health survey data from wide to long format for analytical flexibility. It cleans and summarizes the data, mapping each variable to its respective time points defined by suffixes, calculating the number of iterations and value ranges for each variable. This process involves removing redundant rows and establishing a clear structure to facilitate longitudinal analyses, incorporating a ‘time_factor’ column to align data points with specific survey phases.
Please note: the mental health data requires a slightly different approach, and thus comes with its own unique transformations, because each variable occurs with a different frequency in the data collection sequence.
Code
# Extract the column names
column_names <- names(mentalhealth_survey_data)

# Map each column to its base variable name by removing the time-point suffixes
base_variable_names <- str_remove_all(
  column_names,
  pattern = "_baseline|_two_week|_monthly|_six_week|_second_monthly|_ten_week|_third_monthly"
)

# Create a unique list of base variable names
unique_base_variables <- unique(base_variable_names)

# Create a data frame to store the summary of each variable
variable_summary <- data.frame(variable = unique_base_variables)

# Calculate the number of iterations and value ranges
variable_summary <- variable_summary %>%
  rowwise() %>%
  mutate(
    iterations = length(grep(paste0("^", variable, "(_|$)"), column_names)),
    value_range = {
      cols <- names(mentalhealth_survey_data)[grep(paste0("^", variable, "(_|$)"), column_names)]
      min_val <- min(unlist(mentalhealth_survey_data[cols], use.names = FALSE), na.rm = TRUE)
      max_val <- max(unlist(mentalhealth_survey_data[cols], use.names = FALSE), na.rm = TRUE)
      paste(min_val, max_val, sep = "-")
    }
  )

# Manual edits to the iteration values to account for variables with similar base names
variable_summary <- variable_summary %>%
  mutate(
    iterations = case_when(
      variable == "ami" ~ "7",
      variable == "bis" ~ "4",
      variable == "dsm_anger" ~ "4",
      variable == "dsm_anhedonia" ~ "4",
      variable == "dsm_anxiety" ~ "4",
      variable == "dsm_depression" ~ "4",
      variable == "dsm_dissociation" ~ "4",
      variable == "dsm_dysphoria" ~ "4",
      variable == "dsm_mania" ~ "4",
      variable == "dsm_memory" ~ "4",
      variable == "dsm_personality" ~ "4",
      variable == "dsm_psychosis" ~ "4",
      variable == "dsm_repetition" ~ "4",
      variable == "dsm_sleep" ~ "4",
      variable == "dsm_somatic" ~ "4",
      variable == "dsm_suicidality" ~ "4",
      variable == "dsm_substance" ~ "4",
      variable == "dsm_total" ~ "4",
      variable == "mpss" ~ "4",
      variable == "panas_positive" ~ "1",
      variable == "panas_negative" ~ "1",
      variable == "promis_pain" ~ "2",
      variable == "promis_physical" ~ "2",
      TRUE ~ as.character(iterations)  # Keep existing iterations for other variables
    )
  ) %>%
  filter(variable != "record_id")  # Remove the row where variable is 'record_id'

# Define time suffixes and factors
time_suffixes <- list(
  `7` = c("_baseline", "_two_week", "_monthly", "_six_week", "_second_monthly",
          "_ten_week", "_third_monthly"),
  `4` = c("_baseline", "_monthly", "_second_monthly", "_third_monthly"),
  `2` = c("_baseline", "_two_week"),
  `1` = c("_baseline")
)
time_factors <- list(
  `7` = c("Baseline", "Two Week Follow-Up", "One Month Follow-Up", "Six Week Follow-Up",
          "Two Month Follow-Up", "Ten Week Follow-Up", "Three Month Follow-Up"),
  `4` = c("Baseline", "One Month Follow-Up", "Two Month Follow-Up", "Three Month Follow-Up"),
  `2` = c("Baseline", "Two Week Follow-Up"),
  `1` = c("Baseline")
)

# Transform mentalhealth_survey_data from wide to long format and create 'time_factor'
mentalhealth_survey_data_long <- mentalhealth_survey_data %>%
  pivot_longer(
    cols = -record_id,  # 'record_id' is the identifier and should not be pivoted
    names_to = "variable_time",
    values_to = "value"
  ) %>%
  mutate(
    time = case_when(
      str_detect(variable_time, "_baseline") ~ 0,
      str_detect(variable_time, "_two_week") ~ 1,
      str_detect(variable_time, "_monthly") &
        !str_detect(variable_time, "second_monthly") &
        !str_detect(variable_time, "third_monthly") ~ 2,
      str_detect(variable_time, "_six_week") ~ 3,
      str_detect(variable_time, "_second_monthly") ~ 4,
      str_detect(variable_time, "_ten_week") ~ 5,
      str_detect(variable_time, "_third_monthly") ~ 6,
      TRUE ~ NA_real_  # Handle unexpected cases (NA_real_ keeps the column numeric)
    ),
    variable = str_remove(variable_time, "_baseline|_two_week|_monthly|_six_week|_second_monthly|_ten_week|_third_monthly"),
    time_factor = factor(
      case_when(
        time == 0 ~ "Baseline",
        time == 1 ~ "Two Week Follow-Up",
        time == 2 ~ "One Month Follow-Up",
        time == 3 ~ "Six Week Follow-Up",
        time == 4 ~ "Two Month Follow-Up",
        time == 5 ~ "Ten Week Follow-Up",
        time == 6 ~ "Three Month Follow-Up",
        TRUE ~ NA_character_  # Handles any unexpected cases
      ),
      levels = c("Baseline", "Two Week Follow-Up", "One Month Follow-Up", "Six Week Follow-Up",
                 "Two Month Follow-Up", "Ten Week Follow-Up", "Three Month Follow-Up")
    )
  ) %>%
  select(record_id, time, time_factor, variable, value)  # Reorder and keep only necessary columns

# Omit specific time points for DSM variables
dsm_vars <- c("dsm_anger", "dsm_anhedonia", "dsm_anxiety", "dsm_depression", "dsm_dissociation",
              "dsm_dysphoria", "dsm_mania", "dsm_memory", "dsm_personality", "dsm_psychosis",
              "dsm_repetition", "dsm_sleep", "dsm_somatic", "dsm_suicidality", "dsm_substance",
              "dsm_total")
omit_times <- c("Two Week Follow-Up", "Six Week Follow-Up", "Ten Week Follow-Up")

# Filter out unwanted time points for DSM variables
mentalhealth_survey_data_long <- mentalhealth_survey_data_long %>%
  filter(!(variable %in% dsm_vars & time_factor %in% omit_times))

# Keep specific time points for PROMIS variables
promis_vars <- c("promis_pain", "promis_physical")
keep_promis_times <- c("Baseline", "Two Week Follow-Up")

# Filter and keep specific time points for PROMIS variables
mentalhealth_survey_data_long <- mentalhealth_survey_data_long %>%
  filter(!(variable %in% promis_vars) | (variable %in% promis_vars & time_factor %in% keep_promis_times))

# Keep specific time points for PANAS variables
panas_vars <- c("panas_positive", "panas_negative")
keep_panas_times <- "Baseline"

# Filter and keep specific time points for PANAS variables
mentalhealth_survey_data_long <- mentalhealth_survey_data_long %>%
  filter(!(variable %in% panas_vars) | (variable %in% panas_vars & time_factor == keep_panas_times))
Descriptive Statistics & Visualizations
Please use the data frame ‘covid_long_final’ for any descriptive statistics or visualizations pertaining to behavior, risk, and/or reward data. Note: this data frame is already in long format.
Extracts descriptive statistics for behavior, risk, reward, and mental health data across different time points. Outputs are then converted into a consolidated data frame, reformatted for improved readability, and displayed as a multi-tiered HTML table with headers representing each time point.
Code
# Reorder columns in final data frame to verify baseline completion dates match
covid_long_final <- covid_long_final %>%
  select(record_id, time, baseline_date_complete, baseline_date_complete.x,
         baseline_date_complete.y, everything())

# Following verification, remove the duplicate columns
covid_long_final <- covid_long_final %>%
  select(-baseline_date_complete.x, -baseline_date_complete.y)

# Add new column assigning group label based on baseline completion date and ensure it's a factor
covid_long_final <- covid_long_final %>%
  mutate(
    baseline_date_complete = mdy(baseline_date_complete),  # Convert to Date format if not already
    covid_group = case_when(
      baseline_date_complete >= mdy("05/01/2020") & baseline_date_complete <= mdy("11/30/2020") ~ "first_wave",
      baseline_date_complete >= mdy("12/01/2020") & baseline_date_complete <= mdy("2/28/2021") ~ "second_wave",
      TRUE ~ NA_character_  # For dates outside the range or NA
    ),
    covid_group = factor(covid_group, levels = c("first_wave", "second_wave"))
  ) %>%
  relocate(covid_group, .after = baseline_date_complete)  # Move covid_group right after baseline_date_complete

# Function to convert describeBy output to data frame
describe_to_df <- function(describe_obj) {
  bind_rows(lapply(describe_obj, as.data.frame))
}

# Build the list of variables to describe: sum and average scores for Yes/No
# responses at the overall, low, moderate, and high risk/reward levels
# (equivalent to listing each of the 32 variables individually)
describe_vars <- unlist(lapply(c("risk", "rew"), function(outcome) {
  unlist(lapply(c("", "low_", "moderate_", "high_"), function(level) {
    unlist(lapply(c("sum", "avg"), function(stat) {
      paste0(stat, "_", level, outcome, c("_Y", "_N"))
    }))
  }))
}))

# Generate describeBy output per time point for each variable, convert each to
# a data frame, tag it with its variable name, and stack them into one frame
combined_df <- bind_rows(lapply(describe_vars, function(v) {
  describe_to_df(describeBy(covid_long_final[[v]], group = covid_long_final$time)) %>%
    mutate(variable = v)
}))

# Rename 'vars' to 'time'
combined_df <- combined_df %>% dplyr::rename(time = vars)

# Recode the 'time' column to have a sequence from 0 to 6 for each set of variables
combined_df <- combined_df %>% dplyr::mutate(time = rep(0:6, length.out = n()))

# Keep only the descriptive statistics of interest
final_table <- combined_df %>%
  select(variable, time, n, mean, sd, median, min, max)

# Pivot the data to a wider format with a specific order for statistics
wide_df <- final_table %>%
  pivot_wider(
    names_from = time,
    values_from = c(n, mean, sd, median, min, max),
    names_sep = "_"  # Creating column names like mean_0, sd_0, etc.
  ) %>%
  select(variable, starts_with("n"), starts_with("mean"), starts_with("sd"),
         starts_with("median"), starts_with("min"), starts_with("max"))

# Generate the ordered column names dynamically based on the desired pattern
stats_order <- c("n", "mean", "sd", "median", "min", "max")
time_points <- 0:6
ordered_column_names <- unlist(lapply(time_points, function(t) paste(stats_order, t, sep = "_")))

# Select columns in the desired order
wide_df <- wide_df %>% select(variable, all_of(ordered_column_names))

# Strip the time suffixes from the column names for display purposes only
clean_column_names <- rep(stats_order, times = length(time_points))
names(wide_df)[-1] <- clean_column_names  # Excluding the first column, 'variable'

# Create the formatted table with specified header spans
formatted_table <- kable(wide_df, "html", escape = FALSE) %>%
  kable_styling(full_width = FALSE, position = "left") %>%
  add_header_above(c(" " = 1,
                     "Baseline" = length(stats_order),
                     "Two Week Follow-Up" = length(stats_order),
                     "One Month Follow-Up" = length(stats_order),
                     "Six Week Follow-Up" = length(stats_order),
                     "Two Month Follow-Up" = length(stats_order),
                     "Ten Week Follow-Up" = length(stats_order),
                     "Three Month Follow-Up" = length(stats_order))) %>%
  column_spec(1, bold = TRUE, border_right = TRUE) %>%
  scroll_box(width = "100%", height = "500px")

# Print the formatted table
formatted_table
[Rendered table: descriptive statistics (n, mean, SD, median, min, max) at each of the seven time points — Baseline through Three Month Follow-Up — for all 32 risk/reward variables: sum and average scores for overall, low, moderate, and high risk/reward, each split by 'Yes'/'No' responses.] Key average scores, shown as mean (SD):

| Variable   | Baseline      | Two Week      | One Month     | Six Week      | Two Month     | Ten Week      | Three Month   |
|------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|
| avg_risk_Y | 36.73 (18.32) | 36.80 (18.46) | 36.53 (18.20) | 37.08 (19.07) | 36.94 (18.51) | 36.04 (18.28) | 35.60 (18.29) |
| avg_risk_N | 59.62 (21.59) | 60.10 (21.36) | 60.82 (20.81) | 59.51 (20.68) | 59.72 (20.73) | 59.18 (21.25) | 58.29 (20.71) |
| avg_rew_Y  | 52.59 (17.53) | 52.52 (17.84) | 53.09 (18.55) | 51.94 (18.32) | 52.50 (18.50) | 53.31 (18.26) | 53.84 (18.06) |
| avg_rew_N  | 51.63 (19.51) | 53.22 (20.10) | 53.14 (19.79) | 54.42 (19.66) | 53.29 (20.01) | 54.92 (20.11) | 55.44 (19.77) |

Observation counts for the sum scores decline from n = 1,485 at baseline to n = 706 at the three-month follow-up.
Code
# Map from time factors to numerical values
time_mapping <- c('Baseline' = 0,
                  'Two Week Follow-Up' = 1,
                  'One Month Follow-Up' = 2,
                  'Six Week Follow-Up' = 3,
                  'Two Month Follow-Up' = 4,
                  'Ten Week Follow-Up' = 5,
                  'Three Month Follow-Up' = 6)

# Store results in a list
summary_list <- list()

# Loop over each variable in the variable_summary data frame
for (i in 1:nrow(variable_summary)) {
  variable_name <- variable_summary$variable[i]
  iterations <- as.character(variable_summary$iterations[i])

  # Get the appropriate suffixes and time factors for the current number of iterations
  suffixes <- time_suffixes[[iterations]]
  factors <- time_factors[[iterations]]

  # Create a list to store summaries for the current variable
  variable_data <- list()

  # Calculate statistics for each suffix
  for (j in seq_along(suffixes)) {
    full_var_name <- paste0(variable_name, suffixes[j])

    # Check that the column exists in the data, to avoid errors
    if (full_var_name %in% names(mentalhealth_survey_data)) {
      temp_data <- mentalhealth_survey_data %>%
        dplyr::summarise(
          score_mean = mean(.data[[full_var_name]], na.rm = TRUE),
          n = sum(!is.na(.data[[full_var_name]])),  # Sample size excluding NAs
          sem = sd(.data[[full_var_name]], na.rm = TRUE) / sqrt(sum(!is.na(.data[[full_var_name]]))),
          .groups = 'drop'
        ) %>%
        mutate(
          time_factor = factors[j],
          time = time_mapping[[factors[j]]]  # Assign numerical time based on factor
        )
      variable_data[[j]] <- temp_data
    }
  }

  # Combine all time point data into one data frame for the current variable
  variable_data <- bind_rows(variable_data)
  summary_list[[variable_name]] <- variable_data
}

# Optionally, save each data frame separately for further use
for (name in names(summary_list)) {
  assign(paste0("filtered_", name, "_summary"), summary_list[[name]], envir = .GlobalEnv)
}
Average risk and reward scores are visualized with density ridge plots and box plots, split by time point. After preparing the summary descriptives, the section builds a series of raincloud plots that highlight temporal variation in the data.
Code
# Prepare the time factor
covid_long_final <- covid_long_final %>%
  mutate(time_factor = factor(time, levels = 0:6,
                              labels = c("Baseline", "Two Week Follow-Up", "One Month Follow-Up",
                                         "Six Week Follow-Up", "Two Month Follow-Up",
                                         "Ten Week Follow-Up", "Three Month Follow-Up")))

# Create the density ridge plot for average 'yes' risk scores
p1 <- ggplot(covid_long_final, aes(x = avg_risk_Y, y = time_factor, fill = time_factor)) +
  ggridges::geom_density_ridges(alpha = 0.5, jittered_points = TRUE, point_alpha = 0.5,
                                size = 0.25, point_shape = 20, aes(color = time_factor)) +
  labs(x = "Avg 'Yes' Risk Score", y = '') +
  guides(fill = FALSE, color = FALSE) +
  theme_minimal() +
  xlim(0, 100)

# Create the density ridge plot for average 'no' risk scores
p2 <- ggplot(covid_long_final, aes(x = avg_risk_N, y = time_factor, fill = time_factor)) +
  ggridges::geom_density_ridges(alpha = 0.5, jittered_points = TRUE, point_alpha = 0.5,
                                size = 0.25, point_shape = 20, aes(color = time_factor)) +
  labs(x = "Avg 'No' Risk Score", y = '') +
  guides(fill = FALSE, color = FALSE) +
  theme_minimal() +
  xlim(0, 100)

# Display the density plots side by side
grid.arrange(p1, p2, ncol = 2)
Code
# Create the density ridge plot for average 'yes' reward scores
p3 <- ggplot(covid_long_final, aes(x = avg_rew_Y, y = time_factor, fill = time_factor)) +
  ggridges::geom_density_ridges(alpha = 0.5, jittered_points = TRUE, point_alpha = 0.5,
                                size = 0.25, point_shape = 20, aes(color = time_factor)) +
  labs(x = "Avg 'Yes' Reward Score", y = '') +
  guides(fill = FALSE, color = FALSE) +
  theme_minimal() +
  xlim(0, 100)

# Create the density ridge plot for average 'no' reward scores
p4 <- ggplot(covid_long_final, aes(x = avg_rew_N, y = time_factor, fill = time_factor)) +
  ggridges::geom_density_ridges(alpha = 0.5, jittered_points = TRUE, point_alpha = 0.5,
                                size = 0.25, point_shape = 20, aes(color = time_factor)) +
  labs(x = "Avg 'No' Reward Score", y = '') +
  guides(fill = FALSE, color = FALSE) +
  theme_minimal() +
  xlim(0, 100)

# Display the density plots side by side
grid.arrange(p3, p4, ncol = 2)
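The raincloud plots that follow rely on `geom_flat_violin()`, which is not exported by ggplot2 itself; it is commonly loaded from the PupillometryR package (or sourced from the widely shared raincloud-plot gist). A minimal self-contained sketch using simulated data, not the study's data, assuming PupillometryR is installed:

```r
# Minimal raincloud sketch with simulated data; assumes the PupillometryR
# package as the source of geom_flat_violin().
library(ggplot2)
library(PupillometryR)

set.seed(1)
toy <- data.frame(
  time  = factor(rep(c("Baseline", "Follow-Up"), each = 100)),
  score = c(rnorm(100, 50, 15), rnorm(100, 45, 15))
)

ggplot(toy, aes(x = time, y = score, fill = time)) +
  geom_flat_violin(position = position_nudge(x = 0.2), alpha = 0.5, colour = NA) +
  geom_point(aes(x = as.numeric(time) - 0.25, colour = time),
             position = position_jitter(width = 0.05), size = 0.5, shape = 20) +
  geom_boxplot(width = 0.15, outlier.shape = NA, alpha = 0.5) +
  theme_minimal()
```

The nudged half-violin, jittered raw points, and narrow box plot are the same three layers the full plots below combine, with the group means and SEM error bars added on top.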
Code
# Create the summary avg risk 'Y' data frame
filtered_AvgRiskY_summary <- covid_long_final %>%
  group_by(time_factor) %>%
  dplyr::summarise(
    score_mean = mean(avg_risk_Y, na.rm = TRUE),
    n = n(),  # Sample size for each group
    sem = sd(avg_risk_Y, na.rm = TRUE) / sqrt(n()),
    .groups = 'drop'
  )

# Create the summary avg risk 'N' data frame
filtered_AvgRiskN_summary <- covid_long_final %>%
  group_by(time_factor) %>%
  dplyr::summarise(
    score_mean = mean(avg_risk_N, na.rm = TRUE),
    n = n(),
    sem = sd(avg_risk_N, na.rm = TRUE) / sqrt(n()),
    .groups = 'drop'
  )

# Create the summary avg reward 'Y' data frame
filtered_AvgRewY_summary <- covid_long_final %>%
  group_by(time_factor) %>%
  dplyr::summarise(
    score_mean = mean(avg_rew_Y, na.rm = TRUE),
    n = n(),
    sem = sd(avg_rew_Y, na.rm = TRUE) / sqrt(n()),
    .groups = 'drop'
  )

# Create the summary avg reward 'N' data frame
filtered_AvgRewN_summary <- covid_long_final %>%
  group_by(time_factor) %>%
  dplyr::summarise(
    score_mean = mean(avg_rew_N, na.rm = TRUE),
    n = n(),
    sem = sd(avg_rew_N, na.rm = TRUE) / sqrt(n()),
    .groups = 'drop'
  )

# Create an avg 'yes' risk raincloud plot
p5 <- ggplot(covid_long_final, aes(x = time_factor, y = avg_risk_Y, fill = time_factor, color = time_factor)) +
  geom_flat_violin(adjust = 1.5, trim = FALSE, alpha = 0.5,
                   position = position_nudge(x = 0.2, y = 0), colour = NA) +
  geom_point(aes(x = as.numeric(time_factor) - 0.25, y = avg_risk_Y, colour = time_factor),
             position = position_jitter(width = .05), size = .5, shape = 20) +
  geom_boxplot(outlier.shape = NA, alpha = 0.5, width = 0.25,
               position = position_dodge(width = 0.3), colour = "black") +
  geom_point(data = filtered_AvgRiskY_summary, aes(x = factor(time_factor), y = score_mean),
             shape = 18, position = position_nudge(x = 0.2)) +
  geom_errorbar(data = filtered_AvgRiskY_summary,
                aes(x = factor(time_factor), y = score_mean,
                    ymin = score_mean - sem, ymax = score_mean + sem),
                width = 0.05, position = position_nudge(x = 0.2)) +
  labs(title = "Avg 'Yes' Risk Scores by Time Point",
       y = "Score", x = "Time Point",
       fill = "Time Point", color = "Time Point") +  # Legend titles
  theme_minimal() +
  theme(legend.position = "right",
        legend.title = element_text(face = "bold"),          # Make legend title bold
        axis.text.x = element_text(angle = 45, hjust = 1)) +  # Rotate x-axis labels 45 degrees
  ylim(-5, 105)

print(p5)
Code
# Create an avg 'no' risk raincloud plot
p6 <- ggplot(covid_long_final, aes(x = time_factor, y = avg_risk_N, fill = time_factor, color = time_factor)) +
  geom_flat_violin(adjust = 1.5, trim = FALSE, alpha = 0.5,
                   position = position_nudge(x = 0.2, y = 0), colour = NA) +
  geom_point(aes(x = as.numeric(time_factor) - 0.25, y = avg_risk_N, colour = time_factor),
             position = position_jitter(width = .05), size = .5, shape = 20) +
  geom_boxplot(outlier.shape = NA, alpha = 0.5, width = 0.25,
               position = position_dodge(width = 0.3), colour = "black") +
  geom_point(data = filtered_AvgRiskN_summary, aes(x = factor(time_factor), y = score_mean),
             shape = 18, position = position_nudge(x = 0.2)) +
  geom_errorbar(data = filtered_AvgRiskN_summary,
                aes(x = factor(time_factor), y = score_mean,
                    ymin = score_mean - sem, ymax = score_mean + sem),
                width = 0.05, position = position_nudge(x = 0.2)) +
  labs(title = "Avg 'No' Risk Scores by Time Point",
       y = "Score", x = "Time Point",
       fill = "Time Point", color = "Time Point") +  # Legend titles
  theme_minimal() +
  theme(legend.position = "right",
        legend.title = element_text(face = "bold"),
        axis.text.x = element_text(angle = 45, hjust = 1)) +
  ylim(-5, 105)

print(p6)
Code
# Combine the risk raincloud plots and place the legend on the right
combined_plot56 <- p5 + p6 + plot_layout(guides = 'collect') & theme(legend.position = 'right')

# Print the combined plot
print(combined_plot56)
Code
# Create an avg 'yes' reward raincloud plot
p7 <- ggplot(covid_long_final, aes(x = time_factor, y = avg_rew_Y, fill = time_factor, color = time_factor)) +
  geom_flat_violin(adjust = 1.5, trim = FALSE, alpha = 0.5,
                   position = position_nudge(x = 0.2, y = 0), colour = NA) +
  geom_point(aes(x = as.numeric(time_factor) - 0.25, y = avg_rew_Y, colour = time_factor),
             position = position_jitter(width = .05), size = .5, shape = 20) +
  geom_boxplot(outlier.shape = NA, alpha = 0.5, width = 0.25,
               position = position_dodge(width = 0.3), colour = "black") +
  geom_point(data = filtered_AvgRewY_summary, aes(x = factor(time_factor), y = score_mean),
             shape = 18, position = position_nudge(x = 0.2)) +
  geom_errorbar(data = filtered_AvgRewY_summary,
                aes(x = factor(time_factor), y = score_mean,
                    ymin = score_mean - sem, ymax = score_mean + sem),
                width = 0.05, position = position_nudge(x = 0.2)) +
  labs(title = "Avg 'Yes' Reward Scores by Time Point",
       y = "Score", x = "Time Point",
       fill = "Time Point", color = "Time Point") +  # Legend titles
  theme_minimal() +
  theme(legend.position = "right",
        legend.title = element_text(face = "bold"),
        axis.text.x = element_text(angle = 45, hjust = 1)) +
  ylim(-5, 105)

print(p7)
Code
# Create an avg 'no' reward raincloud plot
p8 <- ggplot(covid_long_final, aes(x = time_factor, y = avg_rew_N, fill = time_factor, color = time_factor)) +
  geom_flat_violin(adjust = 1.5, trim = FALSE, alpha = 0.5,
                   position = position_nudge(x = 0.2, y = 0), colour = NA) +
  geom_point(aes(x = as.numeric(time_factor) - 0.25, y = avg_rew_N, colour = time_factor),
             position = position_jitter(width = .05), size = .5, shape = 20) +
  geom_boxplot(outlier.shape = NA, alpha = 0.5, width = 0.25,
               position = position_dodge(width = 0.3), colour = "black") +
  geom_point(data = filtered_AvgRewN_summary, aes(x = factor(time_factor), y = score_mean),
             shape = 18, position = position_nudge(x = 0.2)) +
  geom_errorbar(data = filtered_AvgRewN_summary,
                aes(x = factor(time_factor), y = score_mean,
                    ymin = score_mean - sem, ymax = score_mean + sem),
                width = 0.05, position = position_nudge(x = 0.2)) +
  labs(title = "Avg 'No' Reward Scores by Time Point",
       y = "Score", x = "Time Point",
       fill = "Time Point", color = "Time Point") +  # Legend titles
  theme_minimal() +
  theme(legend.position = "right",
        legend.title = element_text(face = "bold"),
        axis.text.x = element_text(angle = 45, hjust = 1)) +
  ylim(-5, 105)

print(p8)
Code
# Combine the reward raincloud plots and place the legend on the right
combined_plot78 <- p7 + p8 + plot_layout(guides = 'collect') & theme(legend.position = 'right')

# Print the combined plot
print(combined_plot78)
Code
### Repeating the steps above but stratifying by both time point and epidemiological wave

# Create the summary avg risk 'Y' data frame
filtered_AvgRiskY_waves <- covid_long_final %>%
  group_by(time_factor, covid_group) %>%
  dplyr::summarise(
    score_mean = mean(avg_risk_Y, na.rm = TRUE),
    n = n(),  # Sample size for each group
    sem = sd(avg_risk_Y, na.rm = TRUE) / sqrt(n()),
    .groups = 'drop'
  )

# Create the summary avg risk 'N' data frame
filtered_AvgRiskN_waves <- covid_long_final %>%
  group_by(time_factor, covid_group) %>%
  dplyr::summarise(
    score_mean = mean(avg_risk_N, na.rm = TRUE),
    n = n(),
    sem = sd(avg_risk_N, na.rm = TRUE) / sqrt(n()),
    .groups = 'drop'
  )

# Create the summary avg reward 'Y' data frame
filtered_AvgRewY_waves <- covid_long_final %>%
  group_by(time_factor, covid_group) %>%
  dplyr::summarise(
    score_mean = mean(avg_rew_Y, na.rm = TRUE),
    n = n(),
    sem = sd(avg_rew_Y, na.rm = TRUE) / sqrt(n()),
    .groups = 'drop'
  )

# Create the summary avg reward 'N' data frame
filtered_AvgRewN_waves <- covid_long_final %>%
  group_by(time_factor, covid_group) %>%
  dplyr::summarise(
    score_mean = mean(avg_rew_N, na.rm = TRUE),
    n = n(),
    sem = sd(avg_rew_N, na.rm = TRUE) / sqrt(n()),
    .groups = 'drop'
  )

# Create an avg 'yes' risk raincloud plot
ggplot(covid_long_final, aes(x = time_factor, y = avg_risk_Y, fill = covid_group, color = covid_group)) +
  geom_flat_violin(adjust = 1.5, trim = FALSE, alpha = 0.5,
                   position = position_nudge(x = 0.2, y = 0), colour = NA) +
  geom_point(aes(x = as.numeric(time_factor) - 0.25, y = avg_risk_Y, colour = covid_group),
             position = position_jitter(width = .05), size = .5, shape = 20) +
  geom_boxplot(outlier.shape = NA, alpha = 0.5, width = 0.25,
               position = position_dodge(width = 0.3), colour = "black") +
  geom_line(data = filtered_AvgRiskY_waves,
            aes(x = factor(time_factor), y = score_mean, group = covid_group, colour = covid_group),
            linetype = 2, position = position_nudge(x = 0.2)) +
  geom_point(data = filtered_AvgRiskY_waves, aes(x = factor(time_factor), y = score_mean),
             shape = 18, position = position_nudge(x = 0.2)) +
  geom_errorbar(data = filtered_AvgRiskY_waves,
                aes(x = factor(time_factor), y = score_mean,
                    ymin = score_mean - sem, ymax = score_mean + sem),
                width = 0.05, position = position_nudge(x = 0.2)) +
  labs(title = "Avg 'Yes' Risk Scores by Time Point & Wave",
       y = "Score", x = "Time Point",
       fill = "COVID Wave", color = "COVID Wave") +  # Legend titles
  theme_minimal() +
  theme(legend.position = "right",
        legend.title = element_text(face = "bold"),
        axis.text.x = element_text(angle = 45, hjust = 1)) +
  ylim(-5, 105)
Code
# Create an avg 'no' risk raincloud plot
ggplot(covid_long_final, aes(x = time_factor, y = avg_risk_N, fill = covid_group, color = covid_group)) +
  geom_flat_violin(adjust = 1.5, trim = FALSE, alpha = 0.5,
                   position = position_nudge(x = 0.2, y = 0), colour = NA) +
  geom_point(aes(x = as.numeric(time_factor) - 0.25, y = avg_risk_N, colour = covid_group),
             position = position_jitter(width = .05), size = .5, shape = 20) +
  geom_boxplot(outlier.shape = NA, alpha = 0.5, width = 0.25,
               position = position_dodge(width = 0.3), colour = "black") +
  geom_line(data = filtered_AvgRiskN_waves,
            aes(x = factor(time_factor), y = score_mean, group = covid_group, colour = covid_group),
            linetype = 2, position = position_nudge(x = 0.2)) +
  geom_point(data = filtered_AvgRiskN_waves, aes(x = factor(time_factor), y = score_mean),
             shape = 18, position = position_nudge(x = 0.2)) +
  geom_errorbar(data = filtered_AvgRiskN_waves,
                aes(x = factor(time_factor), y = score_mean,
                    ymin = score_mean - sem, ymax = score_mean + sem),
                width = 0.05, position = position_nudge(x = 0.2)) +
  labs(title = "Avg 'No' Risk Scores by Time Point & Wave",
       y = "Score", x = "Time Point",
       fill = "COVID Wave", color = "COVID Wave") +  # Legend titles
  theme_minimal() +
  theme(legend.position = "right",
        legend.title = element_text(face = "bold"),
        axis.text.x = element_text(angle = 45, hjust = 1)) +
  ylim(-5, 105)
Code
# Create an avg 'yes' reward raincloud plot
ggplot(covid_long_final, aes(x = time_factor, y = avg_rew_Y, fill = covid_group, color = covid_group)) +
  geom_flat_violin(adjust = 1.5, trim = FALSE, alpha = 0.5,
                   position = position_nudge(x = 0.2, y = 0), colour = NA) +
  geom_point(aes(x = as.numeric(time_factor) - 0.25, y = avg_rew_Y, colour = covid_group),
             position = position_jitter(width = .05), size = .5, shape = 20) +
  geom_boxplot(outlier.shape = NA, alpha = 0.5, width = 0.25,
               position = position_dodge(width = 0.3), colour = "black") +
  geom_line(data = filtered_AvgRewY_waves,
            aes(x = factor(time_factor), y = score_mean, group = covid_group, colour = covid_group),
            linetype = 2, position = position_nudge(x = 0.2)) +
  geom_point(data = filtered_AvgRewY_waves, aes(x = factor(time_factor), y = score_mean),
             shape = 18, position = position_nudge(x = 0.2)) +
  geom_errorbar(data = filtered_AvgRewY_waves,
                aes(x = factor(time_factor), y = score_mean,
                    ymin = score_mean - sem, ymax = score_mean + sem),
                width = 0.05, position = position_nudge(x = 0.2)) +
  labs(title = "Avg 'Yes' Reward Scores by Time Point & Wave",
       y = "Score", x = "Time Point",
       fill = "COVID Wave", color = "COVID Wave") +  # Legend titles
  theme_minimal() +
  theme(legend.position = "right",
        legend.title = element_text(face = "bold"),
        axis.text.x = element_text(angle = 45, hjust = 1)) +
  ylim(-5, 105)
Code
# Create an avg 'no' reward raincloud plot
ggplot(covid_long_final, aes(x = time_factor, y = avg_rew_N, fill = covid_group, color = covid_group)) +
  geom_flat_violin(adjust = 1.5, trim = FALSE, alpha = 0.5,
                   position = position_nudge(x = 0.2, y = 0), colour = NA) +
  geom_point(aes(x = as.numeric(time_factor) - 0.25, y = avg_rew_N, colour = covid_group),
             position = position_jitter(width = .05), size = .5, shape = 20) +
  geom_boxplot(outlier.shape = NA, alpha = 0.5, width = 0.25,
               position = position_dodge(width = 0.3), colour = "black") +
  geom_line(data = filtered_AvgRewN_waves,
            aes(x = factor(time_factor), y = score_mean, group = covid_group, colour = covid_group),
            linetype = 2, position = position_nudge(x = 0.2)) +
  geom_point(data = filtered_AvgRewN_waves, aes(x = factor(time_factor), y = score_mean),
             shape = 18, position = position_nudge(x = 0.2)) +
  geom_errorbar(data = filtered_AvgRewN_waves,
                aes(x = factor(time_factor), y = score_mean,
                    ymin = score_mean - sem, ymax = score_mean + sem),
                width = 0.05, position = position_nudge(x = 0.2)) +
  labs(title = "Avg 'No' Reward Scores by Time Point & Wave",
       y = "Score", x = "Time Point",
       fill = "COVID Wave", color = "COVID Wave") +  # Legend titles
  theme_minimal() +
  theme(legend.position = "right",
        legend.title = element_text(face = "bold"),
        axis.text.x = element_text(angle = 45, hjust = 1)) +
  ylim(-5, 105)
Code
# Count unique subjects in each category of 'covid_group'
covid_wave_counts <- covid_long_final %>%
  distinct(record_id, covid_group) %>%
  count(covid_group)

# Generate a bar plot showing the number of subjects in each wave
ggplot(covid_wave_counts, aes(x = covid_group, y = n, fill = covid_group)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = n), vjust = -0.3, size = 3.5) +  # Add labels above bars
  labs(title = "Count of Subjects per COVID Wave",
       x = "COVID Wave",
       y = "Number of Subjects") +
  theme_minimal() +
  theme(legend.position = "none")  # Remove the legend
Code
# Extract month-year and count distinct subjects
monthly_baseline <- covid_long_final %>%
  distinct(record_id, baseline_date_complete) %>%
  mutate(month_year = format(baseline_date_complete, "%Y-%m")) %>%
  filter(baseline_date_complete >= as.Date("2020-05-01") &
         baseline_date_complete <= as.Date("2021-02-28")) %>%
  count(month_year)

# Plot the number of subjects completing baseline in each month-year
ggplot(monthly_baseline, aes(x = month_year, y = n, fill = month_year)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = n), vjust = -0.5, size = 3.5) +  # Count labels above the bars
  labs(title = "Monthly Count of Subjects Completing Baseline Session",
       x = "Month-Year",
       y = "Number of Subjects") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),  # Rotate x-axis labels for readability
        legend.position = "none")  # Remove the legend
Code
# Loop over each unique variable in the 'variable' column
for (variable in unique(mentalhealth_survey_data_long$variable)) {
  # Filter the data for the current variable
  data_filtered <- mentalhealth_survey_data_long %>% filter(variable == !!variable)

  # Retrieve the summary data frame built earlier for this variable
  summary_data <- get(paste0("filtered_", variable, "_summary"))

  # Ensure time_factor is ordered correctly
  summary_data$time_factor <- factor(summary_data$time_factor,
                                     levels = c("Baseline", "Two Week Follow-Up", "One Month Follow-Up",
                                                "Six Week Follow-Up", "Two Month Follow-Up",
                                                "Ten Week Follow-Up", "Three Month Follow-Up"))

  # Create the plot
  p <- ggplot(data_filtered, aes(x = time_factor, y = value, fill = time_factor, color = time_factor)) +
    geom_flat_violin(aes(fill = time_factor), adjust = 1.5, trim = FALSE, alpha = 0.5,
                     position = position_nudge(x = 0.2, y = 0), colour = NA) +
    geom_point(aes(fill = time_factor, color = time_factor),
               position = position_jitter(width = .05), size = .5, shape = 20) +
    geom_boxplot(aes(fill = time_factor, colour = time_factor), outlier.shape = NA, alpha = 0.5,
                 width = 0.15, position = position_nudge(x = -0.2, y = 0), colour = "black") +
    geom_point(data = summary_data,
               aes(x = factor(time_factor), y = score_mean, fill = time_factor, color = time_factor),
               shape = 18, position = position_nudge(x = 0.2)) +
    geom_errorbar(data = summary_data,
                  aes(x = factor(time_factor), y = score_mean,
                      ymin = score_mean - sem, ymax = score_mean + sem,
                      fill = time_factor, colour = time_factor),
                  width = 0.05, position = position_nudge(x = 0.2)) +
    geom_line(data = summary_data, aes(x = factor(time_factor), y = score_mean),
              group = 1, colour = "black", size = 0.5, linetype = "dashed",
              position = position_nudge(x = 0.2)) +
    labs(title = paste(variable, "Scores by Time Point"),
         y = "Score", x = "Time Point",
         fill = "Time Point", color = "Time Point") +  # Legend titles
    theme_minimal() +
    theme(legend.position = "right",
          legend.title = element_text(face = "bold"),        # Make legend title bold
          axis.text.x = element_text(angle = 45, hjust = 1))  # Rotate x-axis labels 45 degrees

  # Print the plot
  print(p)
}
Random intercept models are fitted for each behavioral, risk, and reward metric. Results are displayed in formatted tables reporting the fixed and random effects for the behavior sums and averages.
Code
# Rename 'record_id' to 'id' in the dataframe 'covid_long_final'
covid_long_final <- covid_long_final %>% dplyr::rename(id = record_id)

# Install Matrix if needed (a dependency of lme4); normally a one-time setup step
install.packages("Matrix")
Code
library(Matrix)
library(lme4)

# Random intercept models: each outcome regressed on time with a per-subject intercept
model_random_intercept_num_beh_Y_sum <- lmer(yes_counts ~ time + (1 | id), data = covid_long_final)
model_random_intercept_num_beh_N_sum <- lmer(no_counts ~ time + (1 | id), data = covid_long_final)
model_random_intercept_avg_risk_Y <- lmer(avg_risk_Y ~ time + (1 | id), data = covid_long_final)
model_random_intercept_avg_risk_N <- lmer(avg_risk_N ~ time + (1 | id), data = covid_long_final)
model_random_intercept_avg_rew_Y <- lmer(avg_rew_Y ~ time + (1 | id), data = covid_long_final)
model_random_intercept_avg_rew_N <- lmer(avg_rew_N ~ time + (1 | id), data = covid_long_final)
model_random_intercept_avg_low_risk_Y <- lmer(avg_low_risk_Y ~ time + (1 | id), data = covid_long_final)
model_random_intercept_avg_low_risk_N <- lmer(avg_low_risk_N ~ time + (1 | id), data = covid_long_final)
model_random_intercept_avg_low_rew_Y <- lmer(avg_low_rew_Y ~ time + (1 | id), data = covid_long_final)
model_random_intercept_avg_low_rew_N <- lmer(avg_low_rew_N ~ time + (1 | id), data = covid_long_final)
model_random_intercept_avg_moderate_risk_Y <- lmer(avg_moderate_risk_Y ~ time + (1 | id), data = covid_long_final)
model_random_intercept_avg_moderate_risk_N <- lmer(avg_moderate_risk_N ~ time + (1 | id), data = covid_long_final)
model_random_intercept_avg_moderate_rew_Y <- lmer(avg_moderate_rew_Y ~ time + (1 | id), data = covid_long_final)
model_random_intercept_avg_moderate_rew_N <- lmer(avg_moderate_rew_N ~ time + (1 | id), data = covid_long_final)
model_random_intercept_avg_high_risk_Y <- lmer(avg_high_risk_Y ~ time + (1 | id), data = covid_long_final)
model_random_intercept_avg_high_risk_N <- lmer(avg_high_risk_N ~ time + (1 | id), data = covid_long_final)
model_random_intercept_avg_high_rew_Y <- lmer(avg_high_rew_Y ~ time + (1 | id), data = covid_long_final)
model_random_intercept_avg_high_rew_N <- lmer(avg_high_rew_N ~ time + (1 | id), data = covid_long_final)

# Creating a grouped table with formatted layout including all models
tab_model(
  model_random_intercept_num_beh_Y_sum, model_random_intercept_num_beh_N_sum,
  model_random_intercept_avg_risk_Y, model_random_intercept_avg_risk_N,
  model_random_intercept_avg_rew_Y, model_random_intercept_avg_rew_N,
  model_random_intercept_avg_low_risk_Y, model_random_intercept_avg_low_risk_N,
  model_random_intercept_avg_low_rew_Y, model_random_intercept_avg_low_rew_N,
  model_random_intercept_avg_moderate_risk_Y, model_random_intercept_avg_moderate_risk_N,
  model_random_intercept_avg_moderate_rew_Y, model_random_intercept_avg_moderate_rew_N,
  model_random_intercept_avg_high_risk_Y, model_random_intercept_avg_high_ris_N <- NULL,
  model_random_intercept_avg_high_risk_N,
  model_random_intercept_avg_high_rew_Y, model_random_intercept_avg_high_rew_N,
  title = "Random Intercept Model Results - Risk/Reward Behavior Sums & Averages")
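As a cross-check, the ICC values that `tab_model()` reports can be reproduced directly from a fitted model's variance components (between-person variance τ00 divided by total variance). A minimal sketch, using the first model and the values reported in the table below as an example; `performance` is the package `sjPlot` relies on for this statistic:

```r
library(performance)

# ICC = tau00 / (tau00 + sigma^2), from the model's variance components
vc <- as.data.frame(VarCorr(model_random_intercept_num_beh_Y_sum))
tau00  <- vc$vcov[vc$grp == "id"]
sigma2 <- vc$vcov[vc$grp == "Residual"]
tau00 / (tau00 + sigma2)   # e.g., 9.06 / (9.06 + 4.55) ≈ 0.67 for yes_counts

# Equivalent convenience function
icc(model_random_intercept_num_beh_Y_sum)
```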
Random Intercept Model Results - Risk/Reward Behavior Sums & Averages

| Outcome | Intercept [95% CI] | p | time [95% CI] | p | σ² | τ00 (id) | ICC | N (id) | Obs. | Marg. / Cond. R² |
|---|---|---|---|---|---|---|---|---|---|---|
| yes_counts | 7.16 [6.99 – 7.33] | <0.001 | -0.15 [-0.18 – -0.13] | <0.001 | 4.55 | 9.06 | 0.67 | 1754 | 6438 | 0.007 / 0.668 |
| no_counts | 16.26 [15.74 – 16.77] | <0.001 | -0.82 [-0.92 – -0.73] | <0.001 | 47.58 | 81.45 | 0.63 | 1754 | 6438 | 0.021 / 0.639 |
| avg_risk_Y | 37.41 [36.56 – 38.27] | <0.001 | -0.12 [-0.25 – 0.01] | 0.079 | 90.85 | 258.74 | 0.74 | 1748 | 6404 | 0.000 / 0.740 |
| avg_risk_N | 60.30 [59.26 – 61.35] | <0.001 | -0.73 [-0.90 – -0.56] | <0.001 | 125.59 | 351.83 | 0.74 | 1655 | 5385 | 0.005 / 0.738 |
| avg_rew_Y | 52.74 [51.92 – 53.55] | <0.001 | 0.31 [0.17 – 0.45] | <0.001 | 104.29 | 216.35 | 0.67 | 1748 | 6404 | 0.001 / 0.675 |
| avg_rew_N | 51.75 [50.80 – 52.70] | <0.001 | 0.62 [0.44 – 0.80] | <0.001 | 148.03 | 252.16 | 0.63 | 1655 | 5385 | 0.004 / 0.632 |
| avg_low_risk_Y | 31.29 [30.43 – 32.14] | <0.001 | 0.11 [-0.02 – 0.25] | 0.093 | 94.11 | 252.92 | 0.73 | 1744 | 6379 | 0.000 / 0.729 |
| avg_low_risk_N | 31.84 [30.73 – 32.94] | <0.001 | 0.44 [0.21 – 0.68] | <0.001 | 185.35 | 293.94 | 0.61 | 1513 | 4343 | 0.002 / 0.614 |
| avg_low_rew_Y | 48.06 [47.21 – 48.92] | <0.001 | 0.33 [0.18 – 0.48] | <0.001 | 124.34 | 229.86 | 0.65 | 1744 | 6379 | 0.001 / 0.649 |
| avg_low_rew_N | 50.57 [49.37 – 51.77] | <0.001 | 0.35 [0.07 – 0.64] | 0.016 | 289.88 | 281.97 | 0.49 | 1513 | 4343 | 0.001 / 0.494 |
| avg_moderate_risk_Y | 45.49 [44.35 – 46.63] | <0.001 | -0.26 [-0.48 – -0.04] | 0.019 | 188.35 | 355.16 | 0.65 | 1601 | 4846 | 0.001 / 0.654 |
| avg_moderate_risk_N | 58.50 [57.46 – 59.53] | <0.001 | -0.63 [-0.79 – -0.47] | <0.001 | 94.86 | 334.19 | 0.78 | 1537 | 4744 | 0.004 / 0.780 |
| avg_moderate_rew_Y | 55.68 [54.43 – 56.92] | <0.001 | 0.10 [-0.18 – 0.37] | 0.480 | 307.94 | 350.21 | 0.53 | 1601 | 4847 | 0.000 / 0.532 |
| avg_moderate_rew_N | 53.22 [52.26 – 54.19] | <0.001 | 0.53 [0.33 – 0.73] | <0.001 | 145.41 | 233.63 | 0.62 | 1537 | 4744 | 0.003 / 0.618 |
| avg_high_risk_Y | 51.31 [49.88 – 52.74] | <0.001 | -0.59 [-0.87 – -0.31] | <0.001 | 205.18 | 441.34 | 0.68 | 1292 | 3404 | 0.002 / 0.683 |
| avg_high_risk_N | 72.59 [71.50 – 73.69] | <0.001 | -0.90 [-1.07 – -0.74] | <0.001 | 100.25 | 398.22 | 0.80 | 1582 | 4994 | 0.007 / 0.800 |
| avg_high_rew_Y | 68.85 [67.62 – 70.08] | <0.001 | 0.49 [0.21 – 0.78] | 0.001 | 222.37 | 255.18 | 0.53 | 1292 | 3404 | 0.002 / 0.535 |
| avg_high_rew_N | 51.54 [50.44 – 52.65] | <0.001 | 0.96 [0.76 – 1.17] | <0.001 | 166.53 | 347.90 | 0.68 | 1582 | 4994 | 0.007 / 0.679 |
Linear and quadratic growth models are fitted to explore longitudinal trajectories and capture curvature in time trends, incorporating fixed effects of time alongside random (per-participant) intercepts and slopes. The outcomes of these models are presented in structured tables.
Code
# Fit linear growth models with random intercepts and slopes and homoscedastic
# level-1 residuals, estimated with ML (REML = FALSE)
ctrl <- lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e5))

model_random_slope_num_beh_Y_sum <- lmer(yes_counts ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_random_slope_num_beh_N_sum <- lmer(no_counts ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_random_slope_avg_risk_Y <- lmer(avg_risk_Y ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_random_slope_avg_risk_N <- lmer(avg_risk_N ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_random_slope_avg_rew_Y <- lmer(avg_rew_Y ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_random_slope_avg_rew_N <- lmer(avg_rew_N ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_random_slope_avg_low_risk_Y <- lmer(avg_low_risk_Y ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_random_slope_avg_low_risk_N <- lmer(avg_low_risk_N ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_random_slope_avg_low_rew_Y <- lmer(avg_low_rew_Y ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_random_slope_avg_low_rew_N <- lmer(avg_low_rew_N ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_random_slope_avg_moderate_risk_Y <- lmer(avg_moderate_risk_Y ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_random_slope_avg_moderate_risk_N <- lmer(avg_moderate_risk_N ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_random_slope_avg_moderate_rew_Y <- lmer(avg_moderate_rew_Y ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_random_slope_avg_moderate_rew_N <- lmer(avg_moderate_rew_N ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_random_slope_avg_high_risk_Y <- lmer(avg_high_risk_Y ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_random_slope_avg_high_risk_N <- lmer(avg_high_risk_N ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_random_slope_avg_high_rew_Y <- lmer(avg_high_rew_Y ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_random_slope_avg_high_rew_N <- lmer(avg_high_rew_N ~ time + (1 + time | id), data = covid_long_final, REML = FALSE, control = ctrl)

# Creating a grouped table with formatted layout including all models
tab_model(
  model_random_slope_num_beh_Y_sum, model_random_slope_num_beh_N_sum,
  model_random_slope_avg_risk_Y, model_random_slope_avg_risk_N,
  model_random_slope_avg_rew_Y, model_random_slope_avg_rew_N,
  model_random_slope_avg_low_risk_Y, model_random_slope_avg_low_risk_N,
  model_random_slope_avg_low_rew_Y, model_random_slope_avg_low_rew_N,
  model_random_slope_avg_moderate_risk_Y, model_random_slope_avg_moderate_risk_N,
  model_random_slope_avg_moderate_rew_Y, model_random_slope_avg_moderate_rew_N,
  model_random_slope_avg_high_risk_Y, model_random_slope_avg_high_risk_N,
  model_random_slope_avg_high_rew_Y, model_random_slope_avg_high_rew_N,
  title = "Random Linear Growth Model Results - Risk/Reward Behavior Sums & Averages")
Random Linear Growth Model Results - Risk/Reward Behavior Sums & Averages

| Outcome | Intercept [95% CI] | p | time [95% CI] | p | σ² | τ00 (id) | τ11 (id.time) | ρ01 | ICC | N (id) | Obs. | Marg. / Cond. R² |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| yes_counts | 7.17 [7.00 – 7.35] | <0.001 | -0.17 [-0.20 – -0.13] | <0.001 | 4.10 | 10.11 | 0.12 | -0.37 | 0.69 | 1754 | 6438 | 0.009 / 0.697 |
| no_counts | 16.20 [15.67 – 16.73] | <0.001 | -0.80 [-0.92 – -0.69] | <0.001 | 40.45 | 89.45 | 1.74 | -0.31 | 0.69 | 1754 | 6438 | 0.021 / 0.693 |
| avg_risk_Y | 37.36 [36.51 – 38.21] | <0.001 | -0.09 [-0.24 – 0.06] | 0.251 | 83.43 | 255.04 | 1.80 | -0.06 | 0.76 | 1748 | 6404 | 0.000 / 0.762 |
| avg_risk_N | 60.21 [59.16 – 61.26] | <0.001 | -0.70 [-0.91 – -0.50] | <0.001 | 113.93 | 356.81 | 2.91 | -0.13 | 0.76 | 1655 | 5385 | 0.004 / 0.763 |
| avg_rew_Y | 52.78 [51.97 – 53.59] | <0.001 | 0.29 [0.13 – 0.45] | <0.001 | 96.94 | 216.39 | 1.80 | -0.11 | 0.70 | 1748 | 6404 | 0.001 / 0.699 |
| avg_rew_N | 51.80 [50.85 – 52.75] | <0.001 | 0.61 [0.39 – 0.82] | <0.001 | 134.52 | 254.36 | 3.41 | -0.15 | 0.67 | 1655 | 5385 | 0.004 / 0.666 |
| avg_low_risk_Y | 31.26 [30.42 – 32.10] | <0.001 | 0.13 [-0.02 – 0.28] | 0.093 | 88.36 | 243.52 | 1.40 | 0.03 | 0.75 | 1744 | 6379 | 0.000 / 0.747 |
| avg_low_risk_N | 31.83 [30.71 – 32.94] | <0.001 | 0.44 [0.17 – 0.71] | 0.001 | 167.91 | 307.03 | 4.39 | -0.22 | 0.65 | 1513 | 4343 | 0.002 / 0.650 |
| avg_low_rew_Y | 48.07 [47.22 – 48.92] | <0.001 | 0.33 [0.16 – 0.50] | <0.001 | 116.50 | 228.07 | 1.91 | -0.09 | 0.67 | 1744 | 6379 | 0.001 / 0.673 |
| avg_low_rew_N | 50.62 [49.41 – 51.82] | <0.001 | 0.35 [0.03 – 0.68] | 0.034 | 265.00 | 294.51 | 6.20 | -0.24 | 0.54 | 1513 | 4343 | 0.001 / 0.537 |
| avg_moderate_risk_Y | 45.39 [44.27 – 46.52] | <0.001 | -0.21 [-0.46 – 0.03] | 0.089 | 177.20 | 346.04 | 2.64 | -0.02 | 0.68 | 1601 | 4846 | 0.000 / 0.676 |
| avg_moderate_risk_N | 58.47 [57.43 – 59.51] | <0.001 | -0.63 [-0.82 – -0.43] | <0.001 | 84.88 | 337.52 | 2.53 | -0.12 | 0.80 | 1537 | 4744 | 0.004 / 0.804 |
| avg_moderate_rew_Y | 55.75 [54.49 – 57.00] | <0.001 | 0.06 [-0.25 – 0.38] | 0.701 | 281.48 | 369.84 | 6.31 | -0.24 | 0.57 | 1601 | 4847 | 0.000 / 0.574 |
| avg_moderate_rew_N | 53.36 [52.38 – 54.35] | <0.001 | 0.46 [0.22 – 0.70] | <0.001 | 125.81 | 251.69 | 5.01 | -0.28 | 0.67 | 1537 | 4744 | 0.002 / 0.670 |
| avg_high_risk_Y | 51.25 [49.78 – 52.72] | <0.001 | -0.55 [-0.87 – -0.23] | 0.001 | 184.16 | 481.98 | 5.12 | -0.31 | 0.71 | 1292 | 3404 | 0.002 / 0.714 |
| avg_high_risk_N | 72.53 [71.42 – 73.63] | <0.001 | -0.88 [-1.08 – -0.67] | <0.001 | 85.52 | 405.42 | 3.82 | -0.15 | 0.83 | 1582 | 4994 | 0.006 / 0.831 |
| avg_high_rew_Y | 68.86 [67.62 – 70.10] | <0.001 | 0.49 [0.19 – 0.78] | 0.001 | 216.45 | 262.27 | 1.37 | -0.17 | 0.55 | 1292 | 3404 | 0.002 / 0.547 |
| avg_high_rew_N | 51.55 [50.43 – 52.66] | <0.001 | 0.98 [0.74 – 1.22] | <0.001 | 150.98 | 354.75 | 3.90 | -0.16 | 0.71 | 1582 | 4994 | 0.008 / 0.710 |
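The τ11 and ρ01 entries above describe, respectively, the between-person variance in linear change and the correlation between a person's intercept and slope. A minimal sketch of pulling these components out of one of the fitted models directly, shown for the 'yes' behavior count model:

```r
# Variance components for the random-slope model of 'yes' behavior counts
vc <- VarCorr(model_random_slope_num_beh_Y_sum)
print(vc, comp = c("Variance", "Std.Dev."))  # tau00, tau11, and residual sigma^2

# The intercept-slope correlation (rho01) is stored as an attribute
attr(vc$id, "correlation")
```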
Code
# Quadratic growth models with random intercept and random linear and quadratic slopes
ctrl <- lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e5))

model_quad_num_beh_Y_sum <- lmer(yes_counts ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_quad_num_beh_N_sum <- lmer(no_counts ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_quad_avg_risk_Y <- lmer(avg_risk_Y ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_quad_avg_risk_N <- lmer(avg_risk_N ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_quad_avg_rew_Y <- lmer(avg_rew_Y ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_quad_avg_rew_N <- lmer(avg_rew_N ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_quad_avg_low_risk_Y <- lmer(avg_low_risk_Y ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)
# The commented-out models below are excluded from the table
#model_quad_avg_low_risk_N <- lmer(avg_low_risk_N ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_quad_avg_low_rew_Y <- lmer(avg_low_rew_Y ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)
#model_quad_avg_low_rew_N <- lmer(avg_low_rew_N ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_quad_avg_moderate_risk_Y <- lmer(avg_moderate_risk_Y ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_quad_avg_moderate_risk_N <- lmer(avg_moderate_risk_N ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_quad_avg_moderate_rew_Y <- lmer(avg_moderate_rew_Y ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_quad_avg_moderate_rew_N <- lmer(avg_moderate_rew_N ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)
#model_quad_avg_high_risk_Y <- lmer(avg_high_risk_Y ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_quad_avg_high_risk_N <- lmer(avg_high_risk_N ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)
#model_quad_avg_high_rew_Y <- lmer(avg_high_rew_Y ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)
model_quad_avg_high_rew_N <- lmer(avg_high_rew_N ~ time + I(time^2) + (1 + time + I(time^2) | id), data = covid_long_final, REML = FALSE, control = ctrl)

# Creating a grouped table with formatted layout including all models
tab_model(
  model_quad_num_beh_Y_sum, model_quad_num_beh_N_sum,
  model_quad_avg_risk_Y, model_quad_avg_risk_N,
  model_quad_avg_rew_Y, model_quad_avg_rew_N,
  model_quad_avg_low_risk_Y,   #model_quad_avg_low_risk_N,
  model_quad_avg_low_rew_Y,    #model_quad_avg_low_rew_N,
  model_quad_avg_moderate_risk_Y, model_quad_avg_moderate_risk_N,
  model_quad_avg_moderate_rew_Y, model_quad_avg_moderate_rew_N,
  #model_quad_avg_high_risk_Y,
  model_quad_avg_high_risk_N,
  #model_quad_avg_high_rew_Y,
  model_quad_avg_high_rew_N,
  title = "Random Quadratic Growth Model Results - Risk/Reward Behavior Sums & Averages")
Random Quadratic Growth Model Results - Risk/Reward Behavior Sums & Averages

| Outcome | Intercept [95% CI] | p | time [95% CI] | p | time² [95% CI] | p | σ² | τ00 (id) | τ11 (time) | τ11 (time²) | ρ01 (time / time²) | ICC | N (id) | Obs. | Marg. / Cond. R² |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| yes_counts | 7.43 [7.24 – 7.61] | <0.001 | -0.63 [-0.73 – -0.53] | <0.001 | 0.08 [0.07 – 0.10] | <0.001 | 3.74 | 11.04 | 0.98 | 0.02 | -0.47 / 0.43 | 0.72 | 1754 | 6438 | 0.015 / 0.721 |
| no_counts | 16.38 [15.82 – 16.94] | <0.001 | -1.11 [-1.45 – -0.76] | <0.001 | 0.06 [0.00 – 0.11] | 0.037 | 36.43 | 95.27 | 14.63 | 0.21 | -0.32 / 0.21 | 0.72 | 1754 | 6438 | 0.020 / 0.723 |
| avg_risk_Y | 36.96 [36.09 – 37.83] | <0.001 | 0.61 [0.16 – 1.07] | 0.008 | -0.13 [-0.20 – -0.05] | 0.001 | 79.27 | 250.56 | 12.46 | 0.21 | 0.02 / -0.10 | 0.77 | 1748 | 6404 | 0.001 / 0.775 |
| avg_risk_N | 59.99 [58.91 – 61.08] | <0.001 | -0.34 [-0.93 – 0.24] | 0.249 | -0.06 [-0.16 – 0.03] | 0.193 | 109.32 | 359.04 | 16.02 | 0.23 | -0.12 / 0.04 | 0.77 | 1655 | 5385 | 0.004 / 0.773 |
| avg_rew_Y | 52.92 [52.08 – 53.77] | <0.001 | 0.04 [-0.43 – 0.51] | 0.877 | 0.04 [-0.03 – 0.12] | 0.244 | 95.51 | 212.49 | 7.63 | 0.06 | -0.02 / -0.15 | NA | 1748 | 6404 | 0.004 / NA |
| avg_rew_N | 51.54 [50.54 – 52.55] | <0.001 | 1.06 [0.44 – 1.68] | 0.001 | -0.08 [-0.18 – 0.02] | 0.110 | 130.07 | 259.04 | 18.30 | 0.16 | -0.17 / 0.09 | NA | 1655 | 5385 | 0.012 / NA |
| avg_low_risk_Y | 30.71 [29.85 – 31.58] | <0.001 | 1.08 [0.61 – 1.55] | <0.001 | -0.17 [-0.24 – -0.09] | <0.001 | 82.59 | 241.21 | 15.59 | 0.29 | -0.00 / -0.04 | 0.76 | 1744 | 6379 | 0.001 / 0.764 |
| avg_low_rew_Y | 47.95 [47.07 – 48.84] | <0.001 | 0.53 [0.01 – 1.04] | 0.044 | -0.04 [-0.12 – 0.05] | 0.399 | 114.69 | 222.00 | 9.11 | 0.08 | 0.02 / -0.21 | NA | 1744 | 6379 | 0.004 / NA |
| avg_moderate_risk_Y | 44.64 [43.47 – 45.82] | <0.001 | 1.20 [0.46 – 1.94] | 0.002 | -0.25 [-0.38 – -0.13] | <0.001 | 169.59 | 344.29 | 20.37 | 0.34 | -0.01 / -0.06 | 0.69 | 1601 | 4846 | 0.002 / 0.691 |
| avg_moderate_risk_N | 58.09 [57.01 – 59.16] | <0.001 | 0.04 [-0.51 – 0.59] | 0.877 | -0.12 [-0.21 – -0.03] | 0.008 | 80.90 | 338.30 | 13.89 | 0.20 | -0.11 / 0.03 | 0.81 | 1537 | 4744 | 0.004 / 0.813 |
| avg_moderate_rew_Y | 56.05 [54.74 – 57.36] | <0.001 | -0.48 [-1.40 – 0.44] | 0.308 | 0.10 [-0.05 – 0.25] | 0.204 | 274.37 | 345.65 | 23.33 | 0.33 | 0.08 / -0.40 | 0.58 | 1601 | 4847 | 0.000 / 0.585 |
| avg_moderate_rew_N | 53.28 [52.25 – 54.31] | <0.001 | 0.60 [-0.04 – 1.25] | 0.066 | -0.03 [-0.13 – 0.08] | 0.627 | 122.48 | 248.71 | 14.83 | 0.15 | -0.15 / -0.09 | 0.68 | 1537 | 4744 | 0.002 / 0.679 |
| avg_high_risk_N | 72.22 [71.08 – 73.36] | <0.001 | -0.30 [-0.85 – 0.26] | 0.292 | -0.10 [-0.19 – -0.01] | 0.023 | 81.12 | 411.22 | 18.44 | 0.25 | -0.18 / 0.12 | 0.84 | 1582 | 4994 | 0.006 / 0.838 |
| avg_high_rew_N | 51.09 [49.91 – 52.27] | <0.001 | 1.82 [1.10 – 2.55] | <0.001 | -0.15 [-0.26 – -0.04] | 0.008 | 140.86 | 369.69 | 37.25 | 0.48 | -0.24 / 0.20 | NA | 1582 | 4994 | 0.029 / NA |

(ρ01 values are the intercept–slope correlations for the time and time² random effects, respectively; ICC and conditional R² are not available (NA) for some models.)
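Because the growth models were estimated with maximum likelihood (`REML = FALSE`), nested specifications can be compared with likelihood-ratio tests. A minimal sketch, shown for the 'yes' behavior count outcome (the random-intercept model is refitted with ML here so deviances are comparable):

```r
# Refit the random-intercept model with ML for comparability
m_int  <- lmer(yes_counts ~ time + (1 | id), data = covid_long_final, REML = FALSE)
m_lin  <- model_random_slope_num_beh_Y_sum
m_quad <- model_quad_num_beh_Y_sum

# Likelihood-ratio tests: random slope vs. intercept-only, quadratic vs. linear
anova(m_int, m_lin, m_quad)

# Information-criterion comparison across the three specifications
AIC(m_int, m_lin, m_quad)
```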
Finally, predicted trajectories from these models are visualized, with linear and quadratic fits drawn per participant to depict both overall change and individual variability over time.
Code
# Plot model-implied trajectories, quadratic
ggplot(covid_long_final, aes(time, predict(model_quad_num_beh_Y_sum))) +
  ylim(0, 40) +
  geom_smooth(aes(group = id), method = 'lm', formula = y ~ x + I(x^2),
              se = FALSE, color = "black", size = .1) +
  labs(x = "Time Point", y = "Predicted Sum Number of 'Yes' Behaviors")
Code
ggplot(covid_long_final, aes(time, predict(model_quad_num_beh_N_sum))) +
  ylim(0, 40) +
  geom_smooth(aes(group = id), method = 'lm', formula = y ~ x + I(x^2),
              se = FALSE, color = "black", size = .1) +
  labs(x = "Time Point", y = "Predicted Sum Number of 'No' Behaviors")
Code
# Plot model-implied trajectories, linear
ggplot(covid_long_final, aes(x = time, y = predict(model_random_slope_num_beh_Y_sum))) +
  geom_smooth(aes(group = id), method = 'lm', formula = y ~ x,
              se = FALSE, color = "black", size = .1) +
  labs(x = "Time Point", y = "Predicted Sum Number of 'Yes' Behaviors") +
  ylim(0, 40)
Code
ggplot(covid_long_final, aes(x = time, y = predict(model_random_slope_num_beh_N_sum))) +
  geom_smooth(aes(group = id), method = 'lm', formula = y ~ x,
              se = FALSE, color = "black", size = .1) +
  labs(x = "Time Point", y = "Predicted Sum Number of 'No' Behaviors") +
  ylim(0, 40)
Code
### Prediction-alignment code -- needed because predict() returns fewer rows than the
### original data frame (lmer() drops rows with missing values)

# Helper: write population-level predictions (re.form = NA) back into the full data
# frame, matching rows via the model frame's row names, then fill remaining rows
# with the mean prediction
add_aligned_predictions <- function(df, model, col) {
  preds <- predict(model, re.form = NA)
  idx <- as.integer(rownames(model.frame(model)))  # original row positions used by the model
  df[[col]] <- NA_real_
  df[[col]][idx] <- preds
  df[[col]][is.na(df[[col]])] <- mean(preds, na.rm = TRUE)
  df
}

covid_long_final <- add_aligned_predictions(covid_long_final, model_quad_avg_risk_Y, "quad_predicted_avg_risk_Y")
covid_long_final <- add_aligned_predictions(covid_long_final, model_quad_avg_risk_N, "quad_predicted_avg_risk_N")
covid_long_final <- add_aligned_predictions(covid_long_final, model_quad_avg_rew_Y, "quad_predicted_avg_rew_Y")
covid_long_final <- add_aligned_predictions(covid_long_final, model_quad_avg_rew_N, "quad_predicted_avg_rew_N")
covid_long_final <- add_aligned_predictions(covid_long_final, model_random_slope_avg_risk_Y, "lin_predicted_avg_risk_Y")
covid_long_final <- add_aligned_predictions(covid_long_final, model_random_slope_avg_risk_N, "lin_predicted_avg_risk_N")
covid_long_final <- add_aligned_predictions(covid_long_final, model_random_slope_avg_rew_Y, "lin_predicted_avg_rew_Y")
covid_long_final <- add_aligned_predictions(covid_long_final, model_random_slope_avg_rew_N, "lin_predicted_avg_rew_N")

# Plotting all four graphs, quadratic (grid.arrange() requires the gridExtra package)
q1 <- ggplot(covid_long_final, aes(x = time, y = quad_predicted_avg_risk_Y)) +
  geom_smooth(aes(group = id), method = 'lm', formula = y ~ x + I(x^2),
              se = FALSE, color = "black", size = .1) +
  labs(x = "Time Point", y = "Predicted Average 'Yes' Risk Score")
q2 <- ggplot(covid_long_final, aes(x = time, y = quad_predicted_avg_risk_N)) +
  geom_smooth(aes(group = id), method = 'lm', formula = y ~ x + I(x^2),
              se = FALSE, color = "black", size = .1) +
  labs(x = "Time Point", y = "Predicted Average 'No' Risk Score")
q3 <- ggplot(covid_long_final, aes(x = time, y = quad_predicted_avg_rew_Y)) +
  geom_smooth(aes(group = id), method = 'lm', formula = y ~ x + I(x^2),
              se = FALSE, color = "black", size = .1) +
  labs(x = "Time Point", y = "Predicted Average 'Yes' Reward Score")
q4 <- ggplot(covid_long_final, aes(x = time, y = quad_predicted_avg_rew_N)) +
  geom_smooth(aes(group = id), method = 'lm', formula = y ~ x + I(x^2),
              se = FALSE, color = "black", size = .1) +
  labs(x = "Time Point", y = "Predicted Average 'No' Reward Score")

# Combine and display the plots
grid.arrange(q1, q2, ncol = 2)
Code
grid.arrange(q3, q4, ncol =2)
Code
# Plotting all four graphs, linear
l1 <- ggplot(covid_long_final, aes(x = time, y = lin_predicted_avg_risk_Y)) +
  geom_smooth(aes(group = id), method = 'lm', formula = y ~ x,
              se = FALSE, color = "black", size = .1) +
  labs(x = "Time Point", y = "Predicted Average 'Yes' Risk Score")
l2 <- ggplot(covid_long_final, aes(x = time, y = lin_predicted_avg_risk_N)) +
  geom_smooth(aes(group = id), method = 'lm', formula = y ~ x,
              se = FALSE, color = "black", size = .1) +
  labs(x = "Time Point", y = "Predicted Average 'No' Risk Score")
l3 <- ggplot(covid_long_final, aes(x = time, y = lin_predicted_avg_rew_Y)) +
  geom_smooth(aes(group = id), method = 'lm', formula = y ~ x,
              se = FALSE, color = "black", size = .1) +
  labs(x = "Time Point", y = "Predicted Average 'Yes' Reward Score")
l4 <- ggplot(covid_long_final, aes(x = time, y = lin_predicted_avg_rew_N)) +
  geom_smooth(aes(group = id), method = 'lm', formula = y ~ x,
              se = FALSE, color = "black", size = .1) +
  labs(x = "Time Point", y = "Predicted Average 'No' Reward Score")

# Combine and display the plots
grid.arrange(l1, l2, ncol = 2)
Code
grid.arrange(l3, l4, ncol =2)
Session Information
To enhance reproducibility, details about the working environment used for these analyses can be found below.